Segmental optical phonetics for human and machine speech processing

Author

Lynne E. Bernstein

Abstract

That talkers produce optical as well as acoustic speech signals and that perceivers process both types of signals has become well known. Although perceptual effects due to audiovisual speech integration have been a focus of research involving the visual speech stimulus, relatively little is known about visual-only speech perception and optical phonetic signals. This knowledge is needed to exploit optical signals for applications such as synthetic talking heads and audiovisual automatic speech recognition (ASR). One important practical concern is the wide variation in performance among individual visual perceivers and talkers. This paper focuses on variation in visual phonetic perception, phoneme distinctiveness, and word recognition. The paper also introduces a project linking optical phonetics, speech kinematics, and perception.

Similar resources

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned to the design and production of speech recognition systems. Among these, lip-reading techniques have encountered many challenges for speech recognition, one of the challenges b...

Speech Synthesis

Speech Synthesis is undoubtedly a technological challenge with many potential applications in human-machine communication. More basically, it is a crossroads where researchers with many different backgrounds collaborate to put together their knowledge in computational linguistics, phonetics, prosody, physiology, vocal tract modeling, signal processing, image synthesis, experimental psychology, ...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

The Effect of Using PRAAT Software on Pre-Intermediate EFL Learners’ Supra Segmental Features

The present study investigated the effect of using PRAAT, a free computer software package for the scientific analysis of speech in phonetics, on the suprasegmental features (i.e., intonation and stress) of pre-intermediate Iranian learners of English as a foreign language (EFL). The study used a quasi-experimental design with a pre-test and a post-test. In doing so...

A bag-of-features framework for incremental learning of speech invariants in unsegmented audio streams

We introduce a computational framework that allows a machine to bootstrap flexible autonomous learning of speech recognition skills. Technically, this framework shall enable a robot to incrementally learn to recognize speech invariants from unsegmented audio streams and with no prior knowledge of phonetics. To achieve this, we import the bag-of-words/bag-of-features approach from recent researc...

Publication date: 2000
